FASHION: Fault-Aware Self-Healing Intelligent On-chip Network

نویسندگان

  • Pengju Ren
  • Michel A. Kinsy
  • Mengjiao Zhu
  • Shreeya Khadka
  • Mihailo Isakov
  • Aniruddh Ramrakhyani
  • Tushar Krishna
  • Nanning Zheng
چکیده

To avoid packet loss and deadlock scenarios that arise due to faults or power gating in multicore and many-core systems, the network-onchip needs to possess resilient communication and load-balancing properties. In this work, we introduce the Fashion router, a self-monitoring and self-reconfiguring design that allows for the on-chip network to dynamically adapt to component failures. First, we introduce a distributed intelligence unit, called Self-Awareness Module (SAM), which allows the router to detect permanent component failures and build a network connectivity map. Using local information, SAM adapts to faults, guarantees connectivity and deadlock-free routing inside the maximal connected subgraph and keeps routing tables up-to-date. Next, to reconfigure network links or virtual channels around faulty/power-gated components, we add bidirectional link and unified virtual channel structure features to the Fashion router. This version of the router, named Ex-Fashion, further mitigates the negative system performance impacts, leads to larger maximal connected subgraph and sustains a relatively high degree of fault-tolerance. To support the router, we develop a fault diagnosis and recovery algorithm executed by the Built-In Self-Test, self-monitoring, and self-reconfiguration units at runtime to provide fault-tolerant system functionalities. The Fashion router places no restriction on topology, position or number of faults. It drops 54.3∼55.4% fewer nodes for same number of faults (between 30 and 60 faults) in an 8x8 2D-mesh over other state-of-the-art solutions. It is scalable and efficient. The area overheads are 2.311% and 2.659% when implemented in 8x8 and 16x16 2D-meshes using the TSMC 65nm library at 1.38GHz clock frequency.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)

Nowadays, faults and failures are increasing especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip due to the increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...

متن کامل

CAFT: Cost-aware and Fault-tolerant routing algorithm in 2D mesh Network-on-Chip

By increasing, the complexity of chips and the need to integrating more components into a chip has made network –on- chip known as an important infrastructure for network communications on the system, and is a good alternative to traditional ways and using the bus. By increasing the density of chips, the possibility of failure in the chip network increases and providing correction and fault tol...

متن کامل

A Routing-Aware Simulated Annealing-based Placement Method in Wireless Network on Chips

Wireless network on chip (WiNoC) is one of the promising on-chip interconnection networks for on-chip system architectures. In addition to wired links, these architectures also use wireless links. Using these wireless links makes packets reach destination nodes faster and with less power consumption. These wireless links are provided by wireless interfaces in wireless routers. The WiNoC archite...

متن کامل

Cost-aware Topology Customization of Mesh-based Networks-on-Chip

Nowadays, the growing demand for supporting multiple applications causes to use multiple IPs onto the chip. In fact, finding truly scalable communication architecture will be a critical concern. To this end, the Networks-on-Chip (NoC) paradigm has emerged as a promising solution to on-chip communication challenges within the silicon-based electronics. Many of today’s NoC architectures are based...

متن کامل

Optimal Self-healing of Smart Distribution Grids Based on Spanning Trees to Improve System Reliability

In this paper, a self-healing approach for smart distribution network is presented based on Graph theory and cut sets. In the proposed Graph theory based approach, the upstream grid and all the existing microgrids are modeled as a common node after fault occurrence. Thereafter, the maneuvering lines which are in the cut sets are selected as the recovery path for alternatives networks by making ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1702.02313  شماره 

صفحات  -

تاریخ انتشار 2017